Combining Learning and Word Sense Disambiguation for Intelligent User Profiling

Authors

  • Giovanni Semeraro
  • Marco Degemmis
  • Pasquale Lops
  • Pierpaolo Basile
Abstract

Understanding user interests from text documents can provide support to personalized information recommendation services. Typically, these services automatically infer the user profile, a structured model of the user interests, from documents that were already deemed relevant by the user. Traditional keyword-based approaches are unable to capture the semantics of the user interests. This work proposes the integration of linguistic knowledge in the process of learning semantic user profiles that capture concepts concerning user interests. The proposed strategy consists of two steps. The first one is based on a word sense disambiguation technique that exploits the lexical database WordNet to select, among all the possible meanings (senses) of a polysemous word, the correct one. In the second step, a naïve Bayes approach learns semantic sense-based user profiles as binary text classifiers (user-likes and user-dislikes) from disambiguated documents. Experiments have been conducted to compare the performance obtained by keyword-based profiles to that obtained by sense-based profiles. Both the classification accuracy and the effectiveness of the ranking imposed by the two different kinds of profile on the documents to be recommended have been considered. The main outcome is that the classification accuracy is increased with no improvement on the ranking. The conclusion is that the integration of linguistic knowledge in the learning process improves the classification of those documents whose classification score is close to the likes/dislikes threshold (the items for which the classification is highly uncertain).
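The second step described in the abstract can be illustrated with a minimal sketch, assuming the documents have already been disambiguated into bags of sense identifiers. The abstract does not specify the model variant, smoothing, or identifier format, so the multinomial formulation with Laplace smoothing, the function names, and sense labels such as "bank#finance" are all assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def train_naive_bayes(docs, labels):
    """Fit a multinomial naive Bayes model on bags of sense identifiers.

    docs   -- list of token lists (hypothetical sense identifiers)
    labels -- list of class names, e.g. "likes" / "dislikes"
    """
    classes = sorted(set(labels))
    vocab = {tok for doc in docs for tok in doc}
    log_priors, log_likelihoods = {}, {}
    for c in classes:
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        log_priors[c] = math.log(len(class_docs) / len(docs))
        counts = Counter(tok for d in class_docs for tok in d)
        total = sum(counts.values())
        # Laplace (add-one) smoothing; an assumption, not taken from the paper.
        log_likelihoods[c] = {
            t: math.log((counts[t] + 1) / (total + len(vocab))) for t in vocab
        }
    return log_priors, log_likelihoods


def class_scores(model, doc):
    """Return the unnormalized log-score of each class for one document."""
    log_priors, log_likelihoods = model
    scores = {}
    for c, prior in log_priors.items():
        # Tokens never seen in training are skipped.
        scores[c] = prior + sum(
            log_likelihoods[c][t] for t in doc if t in log_likelihoods[c]
        )
    return scores


def classify(model, doc):
    """Pick the class with the highest log-score."""
    scores = class_scores(model, doc)
    return max(scores, key=scores.get)


# Toy training set: the two senses of "bank" are distinct features.
train_docs = [
    ["bank#finance", "loan#n1", "interest#money"],
    ["bank#finance", "market#n1"],
    ["bank#river", "water#n1", "fish#n1"],
    ["bank#river", "boat#n1"],
]
train_labels = ["likes", "likes", "dislikes", "dislikes"]

model = train_naive_bayes(train_docs, train_labels)
print(classify(model, ["bank#finance", "loan#n1"]))
print(classify(model, ["bank#river", "fish#n1"]))
```

With plain keywords, both senses of "bank" would collapse into a single feature shared by the two classes; keeping them apart is the kind of distinction a sense-based profile is meant to capture. The per-class log-scores returned by `class_scores` could also be used to rank documents rather than only to classify them.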

Similar resources

Mining Semantically Indexed Documents for Intelligent User Profiling

Typically, personalized information recommendation services automatically infer a user profile, a structured model of the user interests, from documents the user already deemed as relevant. Traditional keyword-based approaches are unable to capture the semantics of the user interests. This work proposes a strategy consisting of two steps. The first one is a semantic indexing procedure based on ...

Word Sense Disambiguation for Vocabulary Learning

Words with multiple meanings are a phenomenon inherent to any natural language. In this work, we study the effects of such lexical ambiguities on second language vocabulary learning. We demonstrate that machine learning algorithms for word sense disambiguation can induce classifiers that exhibit high accuracy at the task of disambiguating homonyms (words with multiple distinct meanings). Result...

Design and implementation of Persian spelling detection and correction system based on Semantic

The Persian language has special features (grapheme, homophone, and multi-shape clinging characters) in electronic devices. Furthermore, the design and implementation of NLP tools for Persian are more challenging than for other languages (e.g. English or German). Spelling tools are widely used for editing user texts such as emails and text in editors. Also developing Persian tools will provide Persian progr...

An Intelligent Personalized Service for Conference Participants

This paper presents the integration of linguistic knowledge in learning semantic user profiles able to represent user interests in a more effective way with respect to classical keyword-based profiles. Semantic profiles are obtained by integrating a naïve Bayes approach for text categorization with a word sense disambiguation (WSD) strategy based on the WordNet lexical database (Section 2). Sem...


Publication date: 2007